Abstract

We have compared genomes of Alteromonas macleodii "deep ecotype" isolates from two deep Mediterranean sites and two surface samples from the Aegean and the English Channel. A total of nine different genomes were analyzed. They belong to five clonal frames (CFs) that differ from one another by approximately 30,000 single-nucleotide polymorphisms (SNPs) over their core genomes. Two of the CFs contain three strains each with nearly identical genomes (~100 SNPs over the core genome). One of the CFs had representatives that were isolated from samples taken more than 1,000 km away, 2,500 m deeper, and 5 years apart. These data mark the longest proven persistence of a CF in nature (outside of clinical settings). We have found evidence for frequent recombination events between or within CFs and even with the distantly related A. macleodii surface ecotype. The different CFs had different flexible genomic islands. These islands can be classified into two groups: one type is additive, that is, it contains different numbers of gene cassettes, and is very variable over short time periods (these islands often varied even within a single CF). The other type was more stable and involved the complete replacement of a genomic fragment by another with different genes. Although this type was more conserved within each CF, we found examples of recombination among distantly related CFs, including English Channel and Mediterranean isolates.
Introduction

The process of genome variation in prokaryotic cells in nature remains largely unknown. Cells reproducing clonally can undergo mutations and intragenomic recombination, both legitimate (mediated by recA and among similar DNA sequences) and illegitimate, the latter due to insertion and deletion of mobile genetic elements that do not require sequence homology. Prokaryotes do not have meiosis or sexual reproduction but have widespread recombination phenomena that can take place between individuals of very different genomic backgrounds (horizontal gene transfer). The development of high-throughput sequencing technologies has opened a new window for studying the variation of bacterial genomes through time. Experimental evolution in the laboratory indicates that genomes of bacteria can remain nearly unaltered through large numbers of generations (Conrad et al. 2009; Charusanti et al. 2010; Kishimoto et al. 2010). A suggestive figure is that of Barrick et al. (2009), in which only 29 single-nucleotide polymorphisms (SNPs) were found in a long-term adaptive evolution study of Escherichia coli evolved in glucose minimal medium after 20,000 generations. However, in all these cases, the situation departs largely from nature. The level of complexity that contributes to evolution in natural environments, such as the interaction with other populations that might act as donors of genetic material and the changes in nutrient availability and physicochemical conditions, cannot be reproduced in experiments in controlled environments. Despite these limitations, laboratory experiments have significantly improved knowledge about the mechanisms underlying adaptive evolution in bacteria (for a review, see Conrad et al. [2011]).

Understanding of bacterial genome evolution in natural environments is more limited and has been largely restricted to pathogens (Morelli et al. 2010; Nubel et al. 2010; Mutreja et al. 2011; Reeves et al. 2011). There is information about the variation of pathogenic strains throughout epidemic outbreaks from host to host, but this again is a very special situation that, although extremely important for human health, does not help much in understanding the evolution of free-living bacteria. One way to approach this topic is the study of isolates from similar environments that have space-time continuity, so that the cells represented by the isolated strains can be considered representatives of the same population sampled at different times and locations. A report using this approach for Vibrio cyclitrophicus isolates from different communities associated with particles of different sizes (Vergin et al. 2007; Shapiro et al. 2012) indicated that the microbe was highly recombinogenic, although recombination was more frequent among closely related genomes.

Alteromonas macleodii is a gamma-proteobacterium commonly found in temperate waters around the world (Sass et al. 2001; Lopez-Lopez et al. 2005). Mesocosm studies (Schafer et al. 2006) and metatranscriptomic data (McCarren et al. 2010; Shi et al. 2012) have provided new insights into the relevance of this microbe as an opportunistic strategist when nutrient availability increases in oligotrophic conditions. Recently, we have used a metagenomic fosmid library and two A. macleodii strain genomes (AltDE and AltDE1), obtained from the same seawater sample, to analyze the genomic diversity present at this single time/space point (Gonzaga et al. 2012). The two strains had a relatively conserved core genome (98.6% average nucleotide identity [ANI]) but differed widely in the gene content of several flexible genomic islands (fGIs) located at equivalent genomic locations. Many of these fGIs were involved in the synthesis of cell surface structures, such as the flagellum or the lipopolysaccharide O-chain, and might change phage sensitivity, as found in other cases of fGIs (Rodriguez-Valera et al. 2009; Avrani et al. 2011). The aim of this study was to analyze the genomic diversity of another set of isolates belonging to the deep ecotype clade from different Mediterranean locations and isolation times (plus the single Atlantic isolate available). The genomes have been fully sequenced and assembled, and the results provide a snapshot of how variation happens at the microdiversity level in this marine bacterium in nature. Remarkably, we have found two nearly identical genomes separated by more than 1,000 km, 2,500 m depth, and 5 years, the two closest marine isolates obtained from different and distant samples found to date.

Materials and Methods 

Sample Collection and Sequencing 

Details of isolation and origin of the A. macleodii strains sequenced in this study are described in supplementary figure S1, Supplementary Material online. Briefly, A. macleodii MED64 comes from waters of the Aegean Sea near Lebanon (Pinhassi and Berman 2003); A. macleodii U4, U7, U8, U12, UM4b, UM7, and UM8 were isolated from the Ionian Sea at the Urania Basin (west of Crete) from three different depths (3,455, 3,475, and 3,500 m) (Sass et al. 2001). Finally, A. macleodii 615 comes from the L4 long-term coastal monitoring station in the Western English Channel (Southward et al. 2004).

DNA was extracted by phenol-chloroform as described in Neumann et al. (1992) and checked for quality on a 1% agarose gel. The quantity was measured using Quant-iT PicoGreen dsDNA Reagent (Invitrogen). The genomes were sequenced using the Illumina HiSeq 2000 (100-bp paired-end reads) sequencing platform (Macrogen, Korea). The generated reads were trimmed and assembled de novo using VELVET, version 0.7.63 (Zerbino and Birney 2008). To obtain one single closed contig, we combined Geneious Pro 5.0.1 (with default parameters), using the previously assembled genomes AltDE and AltDE1 as references (Gonzaga et al. 2012), with oligonucleotides designed from the sequences at the ends of the assembled contigs.

Gene Prediction and Annotation 

Gene prediction of the assembled contigs was done using the ISGA pipeline (http://isga.cgb.indiana.edu/, last accessed June 18, 2013). The predicted protein sequences were compared using BlastP to the National Center for Biotechnology Information (NCBI) nr protein database (e value: 10^-5). Open reading frames (ORFs) smaller than 100 bp and without significant homology to other proteins were not considered. BioEdit was used to manipulate the sequences (Hall 1999). GC content was calculated using the EMBOSS tool geecee (Rice et al. 2000). For comparative analyses, reciprocal BlastN and TBlastX searches between the genomes were carried out, leading to the identification of regions of similarity, insertions, and rearrangements. To allow the interactive visualization of genomic fragment comparisons, Artemis v.12 (Rutherford et al. 2000) and the Artemis Comparison Tool ACT v.9 (Carver et al. 2005) were used to compare the genomes. ANI was calculated as defined before (Konstantinidis and Tiedje 2005), using a minimum cutoff of 50% identity and 70% of the length of the query gene. Sequences were aligned using MUSCLE version 3.6 (Edgar 2004) and ClustalW (Thompson et al. 1994) and edited manually as necessary. The CGView application (Stothard and Wishart 2005) was used to plot the circular representations of the CF plasmids.
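
As an illustration of how the ANI cutoffs act as a filter, the following minimal Python sketch averages the identity of best hits that pass both thresholds (at least 50% identity over at least 70% of the query gene length). It is not the pipeline used in this study: the `Hit` record, its field names, the unweighted mean, and the example values are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Best BLAST hit for one query gene (hypothetical record layout)."""
    query_length: int        # length of the query gene, in bp
    aligned_length: int      # length of the aligned region, in bp
    percent_identity: float  # nucleotide identity of the alignment, 0-100


def average_nucleotide_identity(hits, min_identity=50.0, min_coverage=0.70):
    """Unweighted mean identity of hits passing both cutoffs; None if none pass."""
    kept = [
        h.percent_identity
        for h in hits
        if h.percent_identity >= min_identity
        and h.aligned_length / h.query_length >= min_coverage
    ]
    return sum(kept) / len(kept) if kept else None


# Hypothetical hits, for illustration only.
hits = [
    Hit(1000, 950, 98.0),   # kept
    Hit(1000, 600, 99.0),   # rejected: covers only 60% of the query
    Hit(900, 900, 40.0),    # rejected: identity below 50%
    Hit(1200, 1100, 99.0),  # kept
]
print(average_nucleotide_identity(hits))  # 98.5
```

Only the two hits that satisfy both cutoffs contribute to the mean, which is why the coverage threshold matters as much as the identity threshold.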

SNP Analysis 

To further investigate the differences among A. macleodii strains, the nucmer program in the MUMmer3+ package (Kurtz et al. 2004) was used to identify indels and SNPs between small regions of the genome such as genomic islands (GIs). The program uses exact matching, clustering, and alignment extension strategies to create a dot plot based on the number of identical alignments between genomes. SNPs between whole genomes were identified using SNPsFinder (Song et al. 2005).

dN/dS Analysis

The ratio of nonsynonymous (dN) to synonymous (dS) changes is one of the most widely used methods to quantify selection pressures acting on protein-coding regions. A low ratio (dN/dS < 1) indicates purifying selection, whereas a high ratio (dN/dS > 1) is a clear signal of diversifying selection. Orthologous protein sequence pairs were aligned using ClustalW, and the protein alignments were imposed upon the nucleotide sequences using the program pal2nal (Suyama et al. 2006). For each sequence pair, pairwise dN, dS, and dN/dS indices were estimated by maximum likelihood using the codeml program (Yang 1998).
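
The interpretation of the ratio can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above, not a replacement for the maximum-likelihood estimation; the function name, the "neutral" label for a ratio of exactly 1, the handling of dS = 0, and the example values are assumptions.

```python
def classify_selection(dn, ds):
    """Return (ratio, label) for one pairwise dN and dS estimate."""
    if ds == 0:
        # The ratio is undefined when no synonymous change is estimated.
        return None, "undefined (dS = 0)"
    ratio = dn / ds
    if ratio < 1:
        label = "purifying"
    elif ratio > 1:
        label = "diversifying"
    else:
        label = "neutral"  # assumed label for a ratio of exactly 1
    return ratio, label


# Hypothetical dN and dS values, for illustration only.
print(classify_selection(0.002, 1.0))
print(classify_selection(0.3, 0.1))
print(classify_selection(0.1, 0.0))
```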

Recombination 

Multiple alignment of genomic sequences for A. macleodii strains was performed using the Mauve multiple alignment software (v2.3.1) (Darling et al. 2004). To detect potential recombinant strains and possible recombination breakpoints, recombination detection methods implemented in the RDP4 (beta 16) software (Martin et al. 2010) were used. To be considered a reliable recombination event, the highest multiple-comparison corrected P-value cutoff was set at 10^-6 for at least four different methods embedded in the RDP program (RDP, GENECONV, SiScan, BootScan, MaxChi, Chimaera, and 3Seq).
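
The acceptance rule for a recombination event can be written as a short Python sketch. It assumes that the corrected P values for each method have already been exported into a dictionary; that layout, the function name, and the example values are hypothetical and do not represent the RDP4 interface.

```python
P_CUTOFF = 1e-6   # corrected P-value cutoff stated in the text
MIN_METHODS = 4   # minimum number of supporting methods
METHODS = ("RDP", "GENECONV", "SiScan", "BootScan", "MaxChi", "Chimaera", "3Seq")


def is_reliable_event(p_values, cutoff=P_CUTOFF, min_methods=MIN_METHODS):
    """p_values maps a method name to its corrected P value (absent if not detected)."""
    supporting = [
        m for m in METHODS
        if p_values.get(m) is not None and p_values[m] <= cutoff
    ]
    return len(supporting) >= min_methods


# Hypothetical event: four methods at or below the cutoff, one above it.
example = {"RDP": 1e-9, "GENECONV": 3e-8, "SiScan": 1e-12, "MaxChi": 5e-7, "Chimaera": 2e-3}
print(is_reliable_event(example))  # True
```

Requiring agreement among several methods is a conservative choice: an event supported strongly by a single method would be rejected under this rule.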

For estimating mutation and recombination rates, the ClonalFrame software v1.2 was used. ClonalFrame is a Bayesian inference method that reconstructs clonal relationships between the isolates in a sample. Three independent runs of ClonalFrame were performed, each consisting of 100,000 Markov chain Monte Carlo iterations. To assess the relative contribution of recombination and mutation, the r/m and ρ/θ statistics were used. ρ/θ is the ratio of the rates at which recombination and mutation occur. It is therefore a measure of how often recombination events happen relative to mutations. r/m is the ratio of the probabilities that a given site is altered through recombination and through mutation and is therefore a measure of how important the effect of recombination was in the diversification of the sample relative to mutation.
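
The relationship between the two statistics can be made concrete with a small Python sketch. Under the usual reading in which each point mutation changes exactly one site (an assumption of this sketch), dividing r/m by ρ/θ gives the average number of sites changed per recombination event. The function name is hypothetical; the input values are the mean estimates reported in the Results for all surface and deep clade representatives.

```python
def substitutions_per_recombination_event(r_over_m, rho_over_theta):
    """Average sites changed per recombination event, assuming one site per mutation."""
    if rho_over_theta <= 0:
        raise ValueError("rho/theta must be positive")
    return r_over_m / rho_over_theta


# Mean estimates reported in the Results: r/m = 0.45, rho/theta = 0.06.
value = substitutions_per_recombination_event(0.45, 0.06)
print(f"{value:.1f} substitutions per recombination event")
```

This is why recombination events can be rare relative to mutations (ρ/θ well below 1) and still account for a substantial share of the nucleotide replacements.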

Recruitments 

Recruitment plots of the genomes were carried out against some available marine metagenomes (Rusch et al. 2007; Coleman and Chisholm 2010; Quaiser et al. 2011). BlastN (Altschul et al. 1997) was carried out between each A. macleodii genome (615, AltDE, AltDE1, MED64, U4, U7, U8, UM4b, and UM7) and the environmental databases. A very restrictive cutoff of 99% identity over 90% of the length of the environmental read was established to guarantee that only similarities at the level of nearly identical microbes were counted. The numbers of hits were normalized against the genome and database sizes. In the metagenome recruitment of AltDE, 70% identity over 50% of the length of the metagenomic read was used as a cutoff to construct the plots.
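
The recruitment filter and normalization can be sketched as follows. The text states only that hit counts were normalized against genome and database sizes, so the units chosen here (hits per kb of genome per Gb of database), the `ReadHit` record, and the example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReadHit:
    """One BlastN hit of a metagenomic read against a genome (hypothetical layout)."""
    read_length: int         # length of the environmental read, in bp
    aligned_length: int      # length of the aligned region, in bp
    percent_identity: float  # identity of the alignment, 0-100


def count_recruited(hits, min_identity=99.0, min_coverage=0.90):
    """Count reads with at least 99% identity over at least 90% of the read."""
    return sum(
        1 for h in hits
        if h.percent_identity >= min_identity
        and h.aligned_length / h.read_length >= min_coverage
    )


def normalized_recruitment(n_hits, genome_size_bp, database_size_bp):
    """Hits per kb of genome per Gb of database (units are an assumption)."""
    return n_hits / (genome_size_bp / 1e3) / (database_size_bp / 1e9)


# Hypothetical hits, for illustration only.
read_hits = [
    ReadHit(100, 100, 100.0),  # recruited
    ReadHit(100, 95, 99.0),    # recruited
    ReadHit(100, 80, 100.0),   # rejected: covers only 80% of the read
    ReadHit(100, 100, 97.0),   # rejected: identity below 99%
]
print(count_recruited(read_hits))  # 2
```

Passing `min_identity=70.0` and `min_coverage=0.50` to `count_recruited` reproduces the looser cutoff described for the AltDE recruitment plots.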

Accession Numbers 

The genome sequences have been deposited in GenBank under the following accession numbers: CP004849 for A. macleodii U4, CP004851 for A. macleodii U7, CP004852 for A. macleodii U8, CP004853 for A. macleodii UM7, CP004855 for A. macleodii UM4b, CP004848 for A. macleodii MED64, and CP004846 for A. macleodii 615. The U4, UM7, and 615 plasmid sequences have also been deposited at NCBI under the accession numbers CP004850, CP004854, and CP004847, respectively.

Results 

Diversity within the Deep Ecotype Core Genome 

Two genomes of isolates belonging to the deep ecotype clade of A. macleodii, AltDE and AltDE1, obtained from 1,000 m deep in the South Adriatic, have already been described (Ivars-Martinez, Martin-Cuadrado, et al. 2008; Gonzaga et al. 2012). We report here the genome sequences of seven new strains of A. macleodii, six isolated from different locations throughout the Mediterranean and one from the English Channel (Atlantic Ocean) (supplementary fig. S1, Supplementary Material online). They all belong to the "deep ecotype" clade (Lopez-Lopez et al. 2005; Ivars-Martinez, D'Auria, et al. 2008). This clade is quite divergent from the A. macleodii "surface clade" (Lopez-Perez et al. 2012) and, in spite of the high similarity of the 16S rRNA gene (>98%), could actually belong to a separate species (ANI below 85% over the core genome) (Ivars-Martinez, Martin-Cuadrado, et al. 2008) (supplementary fig. S2, Supplementary Material online). Although it seems contradictory, some representatives of the deep clade have actually been isolated from surface waters. We already proposed (Ivars-Martinez, Martin-Cuadrado, et al. 2008) that the "deep" clade is a group of strains adapted to live on larger fast-sinking particles and that this explains their frequent isolation from deep Mediterranean waters, although they can be found in surface samples as well. Most of the genomes reported here (strain designations starting with U) belong to isolates that were obtained from a much deeper sample (~3,500 m) in the Urania Basin (Sass et al. 2001), located approximately 1,000 km away from the South Adriatic sampling site where AltDE and AltDE1 were obtained. Besides, the U isolates were retrieved 5 years earlier than the Adriatic ones and come from three separate samples taken at approximately 20-m intervals along the water column (see Materials and Methods). The two additional genomes belong to strains isolated from surface samples: MED64 was isolated from coastal waters off Israel (Eastern Aegean) (Pinhassi and Berman 2003), and 615 from the L4 long-term coastal monitoring station off Plymouth (English Channel) (Southward et al. 2004). In spite of the different locations and environmental conditions of the sampling sites, the isolates form a highly homogeneous clade with average nucleotide identities over 98% (table 1). There were two pairs of genomes (U8-U12 and UM7-UM8) that were identical; both pairs were retrieved from the same sample and can be considered resequencing of the same strain. The remaining genomes had a gradient of similarities that is illustrated in figure 1 and table 1. The genomes have synteny over most of the core genome, which includes 2,614 genes, and can be classified into five clonal frames (CFs). We use the term CF to describe bacterial lineages of common ancestry and apparent clonal (asexual) descent but in which replacement of genome fragments by recombination, selection, and drift by neutral genetic mutations has occurred (Milkman and Bridges 1990). Two of them, CF1 and CF2, comprised three different strain genomes each, and the remaining three were represented by a single genome each.

Table 1
General Features of "Deep Ecotype" Alteromonas macleodii Genomes

| Genomic Features | AltDE1 | UM7/UM8 | UM4b | U4 | U7 | U8/U12 | 615 | AltDE | MED64 |
|---|---|---|---|---|---|---|---|---|---|
| CF | 1 | 1 | 1 | 2 | 2 | 2 | 3 | 4 | 5 |
| Size (bp) | 4,643,844 | 4,628,423 | 4,407,104 | 4,487,701 | 4,442,936 | 4,395,035 | 4,389,661 | 4,480,937 | 4,397,537 |
| Total ORFs | 4,347 | 4,350 | 4,128 | 4,182 | 4,113 | 4,062 | 4,546 | 4,315 | 4,050 |
| Shared genes | 2,828 | 2,826 | 2,832 | 2,919 | 2,801 | 2,799 | 2,804 | 2,806 | 2,822 |
| Unique genes | 12 | 6 | 85 | 77 | 104 | 80 | 853 | 480 | 338 |
| Plasmid | pAMDE1 | pAMDE1 | - | pAMDE1 | - | - | pAMEC615 | - | - |
| ICE | ICEAmaAS1 | ICEAmaAS1 | ICEAmaAS1 | - | - | - | - | - | ICEAmaAgS1 |
| ANI (%) [a] | - | 99.97 | 99.96 | 98.54 | 98.55 | 98.54 | 98.37 | 98.57 | 98.49 |
| No. SNPs | | 77 | 87 | | 48 | 116 | 34,884 | 23,225 | 34,117 |
| Ka/Ks [b] | | 0.002 | 0.003 | 0.133 | 0.135 | 0.138 | 0.187 | 0.172 | 0.142 |
| fGIs | | | | | | | | | |
| fGI-3: flagellum glycosylation | 1 [d] | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 4 |
| fGI-4: EPS | 1 | 1 | 1 | 2 | 2 | 2 | 3 | 4 | 5 |
| fGI-6: LPS O-chain | 1 | 1 | 1 | 2 | 2 | 2 | 3 | 4 | 5 |
| fGI-1: metal resistance/hydrogenase | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| fGI-2: integron | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| fGI-5: urease/CRISPR | 1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| fGI-7: ND | 1 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 6 |
| fGI-8: glycosyl transferases | 1 | 1 | 2 | 2 | 3 | 4 | 5 | 6 | 7 |
| fGI-9: MGI | 1 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 6 |

Identical metagenomic fosmids [c] (five values, in column order): 90, 14, 5, 34, 9.
Giant protein: version 1 in AltDE1, UM7/UM8, and UM4b; two further entries of 1 appear in later columns that cannot be assigned from the source.

Note. ND, not determined; EPS, exopolysaccharide; LPS, lipopolysaccharide. Shading in column 1 has been used to highlight the replacement fGIs. For the number of SNPs, AltDE1 was used as a reference genome in CF1, CF3, CF4, and CF5; for CF2, U4 was the reference genome.
[a] ANI (Konstantinidis and Tiedje 2005) to AltDE1 homologous genes.
[b] Ratio of the number of nonsynonymous substitutions per nonsynonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks).
[c] Number of identical fosmids found in an Adriatic metagenome (Gonzaga et al. 2012).
[d] Numbers identify the different versions (different gene content) of the fGI.

Strains belonging to the same CF differed by between 48 and 116 SNPs over the core genome, whereas those belonging to different CFs differed by between 25,576 and 31,564 SNPs, indicating that they must have diverged much earlier. Most SNPs were evenly distributed throughout the genomes and produced synonymous replacements. However, when comparing different CFs, some hotspots with high dN/dS values could be identified (fig. 1). These hotspots were located at different genes in the different strains, and we could not discern any obvious pattern from the types of genes or their location. The dN/dS values were higher (~0.1) among CFs than within CFs (0.002), which seems contradictory (Rocha et al. 2006). However, the ClonalFrame analysis (see later) indicates that recombination is also much more common in close relatives, so many of the synonymous replacements found within the same CF can be due to recombination rather than to point mutations.
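
To put these SNP counts on a common scale, the following Python sketch converts them into densities per kb. The denominator is an assumption: it uses the 2,820,463 bp of aligned core genome given for the consensus tree in figure 2, which may not be exactly the length over which the SNPs were counted. The function and constant names are hypothetical.

```python
CORE_ALIGNED_BP = 2_820_463  # aligned core length from fig. 2 (assumed denominator)


def snps_per_kb(n_snps, aligned_bp=CORE_ALIGNED_BP):
    """SNP density per kb of aligned sequence."""
    return n_snps / (aligned_bp / 1000)


comparisons = [
    ("within a CF, minimum", 48),
    ("within a CF, maximum", 116),
    ("between CFs, minimum", 25_576),
    ("between CFs, maximum", 31_564),
]
for label, n in comparisons:
    print(f"{label}: {snps_per_kb(n):.3f} SNPs per kb")
```

Under this assumption, genomes from different CFs differ at roughly 9 to 11 sites per kb, whereas genomes within a CF differ at well under one site per 10 kb.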

There have been reports of frequent recombination among A. macleodii strains, some even spanning the two clades (Lopez-Lopez et al. 2005; Ivars-Martinez, D'Auria, et al. 2008). Besides, in the previous work comparing only two strains and metagenomic fosmids, there was evidence indicating that recombination happened mostly in close proximity to the fGIs (Gonzaga et al. 2012). However, access to complete genomes allows for a more reliable and comprehensive assessment. We could generate genome alignments of more than 1.5 kb for 537 locally collinear blocks (65.8% of the core genome) shared by the 13 A. macleodii strains sequenced to date, including strains belonging to the surface clade. In total, 143 recombination events were identified (supplementary fig. S3, Supplementary Material online) that appeared spread along the core genome. There were some chromosomal recombination hotspots, but they do not appear to be more frequent near the fGIs. Two examples of the maximum-likelihood (ML) trees generated using these regions are shown in supplementary figure S3, Supplementary Material online.

Genome Biol. Evol. 5(6): 1220-1232. doi:10.1093/gbe/evt089 Advance Access publication May 31, 2013

Lopez-Perez et al.

Fig. 1. Alignment of the available Alteromonas macleodii deep ecotype genomes. They are arranged by numbers of SNPs along their core genomes, starting from the bottom genome of MED64. Numbers of SNPs are indicated to the right, and each number corresponds to the comparison of the two strains connected by the arrow. Vertical green lines indicate SNPs among the strains within a CF, whereas white ones mark genes with a dN/dS ratio > 1 between pairs of strains belonging to different CFs. Colored rectangles to the left delineate the genomes belonging to the same CF. Arrows show the location of lysogenic phages inserted within the genomes; the same color indicates the same phage. ICEs are highlighted with blue rectangles. Some fGIs identified in the comparison between AltDE and AltDE1 (Gonzaga et al. 2012) have been highlighted in purple and are identified by the inferred function (on top).

To confirm the high level of recombination detected among CFs, we generated ML trees of different genomic regions, including the alignable parts of the fGIs, and compared them with the consensus tree generated from all the alignable regions in the genomes (core). The results shown in figure 2 illustrate that the topology varies depending on the regions selected, near fGIs or in the core. This confirms that recombination events often break the clonal structure of the population, as has been described for genomes of V. cyclitrophicus (Shapiro et al. 2012). Furthermore, recombination events often crossed the boundary between surface and deep isolates. To assess the relative effect of recombination and mutation, we used ClonalFrame software v1.2 to estimate the frequency of recombination relative to mutation (ρ/θ) and the weight of recombination on diversification relative to mutation (r/m) (Didelot and Falush 2007). The mean estimates of the ρ/θ and r/m ratios, including all the surface and deep clade representatives, were 0.06 and 0.45, respectively. These results indicate that, in spite of the frequent recombination, mutation events are much more frequent than recombination events. Similar ρ/θ and r/m ratios were estimated for the strains within the surface clade and among the members of CF1 (supplementary table S1, Supplementary Material online). However, within representatives of the deep clade and the strains within CF2, the latter all coming from the same location, ratios of recombination-associated replacements were much higher (r/m = 5.13 and 8.61, respectively). These results suggest that, although recombination was less frequent than mutation, the weight of recombination in the total number of nucleotide replacements was quite significant, confirming previous observations (Lopez-Lopez et al. 2005; Ivars-Martinez, D'Auria, et al. 2008).

[Figure 2: ML tree panels positioned along the AltDE1 genome (axis: Genome, Mbp). Labeled fGI panels: Flagellum (42,214 bp), O-chain (40,399 bp), and fGI7 (23,166 bp). Further panels, labeled Core Samples: 15,457 bp, 9,533 bp, 12,211 bp, and 8,598 bp. Boxed consensus tree: 2,820,463 bp.]

Fig. 2. ML trees of the nine Alteromonas macleodii deep ecotype representatives. Type strain A. macleodii ATCC 27126 was used as an outgroup to root the trees. Members of different CFs within the deep clade are labeled with color-coded dots. Trees below the genome were calculated based on aligned, randomly selected core genome regions. The trees above the genome correspond to the alignable parts of some fGIs. The nucleotide lengths of the alignments are indicated in brackets for each tree. In the box is the consensus tree based on 2.8 Mbp of aligned core genome. AltDE1 was used as a reference genome to locate the position of the sequences used to generate the trees.

Plasmids, Conjugative Elements, and Phages 

The plasmids and lysogenic phages were very variable genomic features. The 300-kb conjugative plasmid pAMDE1, previously described by Gonzaga et al. (2012) in the Adriatic Sea isolate AltDE1, was also found with identical sequence in the Urania isolate UM7, also from CF1, and in U4, which belongs to CF2. However, it was lost in the third representative of CF1 (UM4b). The perfect conservation of the plasmid sequence indicates recent transfer. An important feature of this plasmid was the presence of a hybrid polyketide and nonribosomal peptide (NRPS-PKS) cluster of 65 kb. This cluster is flanked by IS elements, suggesting that it could be a mobile genetic element (Gonzaga et al. 2012). Interestingly, within the genomes of strains U7 and U8 of CF2, the plasmid was not present, but an insertion of the same NRPS-PKS cluster was found in the chromosome (supplementary fig. S4, Supplementary Material online). The insertion was located next to the single Phe-tRNA, an insertion target producing high variability in all the known strains of A. macleodii, including those of the surface clade (Lopez-Perez et al. 2012). A completely different plasmid, pAMEC615, was found in 615. The plasmid was confirmed by polymerase chain reaction (PCR) as a circular replicon of approximately 200 kb (supplementary fig. S4, Supplementary Material online). A fragment of 21 kb flanked by transposases in this plasmid is identical to a GI identified in the chromosome of A. macleodii 673, a surface clade strain obtained from the same water sample (Lopez-Perez et al. 2012). This GI was next to a tRNA and is probably involved in benzoate degradation (supplementary fig. S4, Supplementary Material online). Another 19-kb region (mostly hypothetical proteins) in this plasmid was nearly identical to a region in the chromosome of Alteromonas sp. SN2, an isolate from intertidal sediments in Korea that is only distantly related to A. macleodii. Altogether, these results reflect the frequent exchange of plasmids or plasmid fragments across a taxon at the genus level and of global distribution.

A similar dynamic distribution was found for the integrative and conjugative elements (ICEs). ICEs are also mobile genetic elements transferable by conjugation, but, unlike plasmids, they are always integrated into the chromosome. All the genomes of CF1 contain the same ICEAmaAS1 (supplementary fig. S5, Supplementary Material online) already reported in AltDE1 (Gonzaga et al. 2012). However, although the one in UM7 was identical to that of AltDE1, the one in UM4b has an insertion of three new genes located in a "hotspot" (Hs) (Beaber et al. 2002). A different ICE, which also belongs to the SXT/R391 family (Wozniak et al. 2009), was found in strain MED64. Synteny of the core ICE genes was well preserved (supplementary fig. S5, Supplementary Material online), but the MED64 Hs regions were more similar to Hs regions found in different Vibrio cholerae strains. The SXT/R391 ICE family shares the same chromosomal integration site, the 5'-end of prfC, which encodes peptide chain release factor 3. However, in strain U4 of CF2, a 15-kb ICE-like region, similar to ICEAmaAS1, was found at a different genomic location (supplementary fig. S5, Supplementary Material online). This ICE-like region has only 4 of the 14 tra genes found in the A. macleodii ICEs and a part of the Hs2 region found in ICEAmaAS1. Overall, the ICE is one of the most dynamic regions in the chromosome, with frequent changes within CFs.

The strains AltDE1 and UM7 both have a lysogenic phage inserted at the same position in the genome. This insertion was positively identified as a lysogenic phage thanks to the presence of a CRISPR spacer with identical sequence in AltDE (Gonzaga et al. 2012). The presence of this phage in the two strains, separated by site and time of isolation, indicates that phages can stay in the lysogenic state for a long time in nature. On the other hand, UM4b, also in CF1, was devoid of this lysogenic phage. In the three CF2 strains, a different phage, similar to E. coli prophage CP4-57 (Kirby et al. 1994), was found inserted at a tRNA-Leu (supplementary fig. S6A, Supplementary Material online). The att site could be detected owing to its similarity to the site-specific integration site of a similar prophage found in Haemophilus influenzae (Hauser and Scocca 1992; Wang et al. 2009) (supplementary fig. S6B, Supplementary Material online). MED64, the only representative of CF5, contained yet a third different phage that showed clear similarities to the lambdoid E. coli HK97 phage (Juhala et al. 2000); no obvious attachment site or insertion target was found (supplementary fig. S6C, Supplementary Material online). Neither of the two new phages detected here showed homology to any AltDE CRISPR spacers.

Flexible Genomic Islands 

The main differences found among CFs were the presence of different fGIs. Recently, A. macleodii fGIs were defined as regions detected when comparing closely related strains that have similar location and inferred function but contain different genes (Gonzaga et al. 2012). The availability of more than one genome for some CFs allows the distinction of two types of fGIs that probably have different mechanisms of diversification.

The most obvious fGIs have a pattern of variation that could
be described as "complete replacement." In this kind, a cluster
of genes is replaced by another totally different set, albeit
coding for a similar or related function. In general, there is very
little similarity, if any, among the genes present in the
equivalent replacement fGI found in different CFs, as if the region
had been completely replaced. In our case, the genes of this
type of fGI encoded the synthesis of exposed structures of the
cell such as the lipopolysaccharide O-chain, the
exopolysaccharide, and the flagellum (fGI3, 4, and 6) (table 1). In
essentially all these cases, the fGIs present in different CFs have
sets of genes that produce different sugar skeletons that form or
decorate the exposed structure. As previously discussed
(Rodriguez-Valera et al. 2009; Gonzaga et al. 2012;
Rodriguez-Valera and Ussery 2012), these fGIs could generate
different phage recognition targets in the population, diluting
the predation pressure of these viruses. These fGIs were very
well conserved within CFs but were generally different for
different CFs (table 1). However, we found some interesting
exceptions that might help in understanding the mechanisms of
variation at play in these genomic regions. One of them
affects the flagellum glycosylation fGI of strain 615
and the three strains in CF2. This fGI is located in the middle of
the large flagellum gene cluster (Gonzaga et al. 2012) and
contains flagellar structural genes whose products are exposed to
the environment and genes involved in flagellin glycosylation (fig. 3).
Flagellin glycosylation has been described in several bacterial
species and is an essential modification, allowing both flagellar
assembly and function (Logan 2006). This fGI was different for
each CF except CF2 and CF3, for which the sequence found in
this region was identical (fig. 3). It is remarkable that CF3 is
represented by the single isolate 615 obtained from the
English Channel, whereas all the members of CF2 come
from the Urania Basin. The genomes of 615 and the CF2 strains
differ by an average of 11 SNPs per kb. However, only one
SNP was found in this region of 17 kb, whereas roughly 187
would be expected at the genome-wide average. In contrast,
sequence analysis of genes flgH and flgI, located before the
5'-end of the island, showed a large accumulation of SNPs,
even though they code for the essential L- and P-rings of the
flagellar basal body (fig. 3). A similar case of a shared gene
cluster (or rather, in this case, of a large gene) among different
CFs was found for the giant protein (supplementary fig. S7,
Supplementary Material online). Although this region cannot be
defined as an fGI because it was not present in all the strains,
this GI was found in CF1, CF3, and CF4. It contains mostly a
single gene coding for a "giant protein" of 6,573 aa. Most of
these giant proteins have been identified in nonpathogenic
environmental bacteria as large cell-surface glycoproteins
(Reva and Tummler 2008). The large protein sequence contains
several VBCS repeats found in other giant proteins and,
although their function is not well understood, they are
believed to function in protein-protein and protein-carbohydrate
interactions. SNP analysis revealed that 615
has the most divergent version of these giant proteins.
However, between CF1 and AltDE, most of the giant protein
gene is identical except for the region that contains the VBCS
repeats (supplementary fig. S7, Supplementary Material
online). As in the case of the flagellum glycosylation fGI,
a large number of synonymous SNPs were found at the
5'-boundary of the island (fig. 3 and supplementary fig. S7,
Supplementary Material online). These unusually high numbers
of concentrated SNPs could be due to either positive selection
or recombination. However, the highly significant enrichment
of synonymous SNPs (fig. 3) suggests that the increase in SNP
frequency results from recombination events with divergent
genomes. On the other hand, the nearly absolute sequence
conservation along the swapped regions indicates that they
have been exchanged quite recently, before a significant
number of SNPs could accumulate.

Fig. 3. Comparison of the gene cluster involved in the biosynthesis and assembly of flagellar components, with the flagellum glycosylation island fGI
found in different CFs. The plots above the 615 cluster indicate the number of SNPs in a 100-bp window when compared with the three CF2 strains (all identical
over this fGI). The average number of SNPs in the genome is indicated by the white line. The region enlarged in the red box indicates the rates of
nonsynonymous (dN) and synonymous (dS) substitutions and the dN/dS ratios.

Fig. 4. fGI9 or MGI. (A) Comparison of gene content of the MGI in the strain genomes. The attachment sites, left (attL) and right (attR), are highlighted
in red, the integrase is highlighted in yellow, and the red arrow indicates the position of the conserved sequence that mimics the origin of transfer (oriT).
Predicted ORFs with the same color are involved in the same function. (B) Alignment of the attachment sites attL and attR.
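
The windowed SNP comparison described for fig. 3 can be illustrated with a minimal sketch. It assumes two sequences that are already aligned to the same length and uses non-overlapping 100-bp windows; the text does not say whether the windows overlapped, and the function names here are hypothetical, not taken from the study's pipeline.

```python
def snp_window_counts(ref: str, query: str, window: int = 100) -> list[int]:
    """Count mismatches per non-overlapping window of two aligned sequences.

    Assumption: positions holding a gap or an ambiguous base in either
    sequence are ignored rather than counted as SNPs.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    counts = []
    for start in range(0, len(ref), window):
        a = ref[start:start + window].upper()
        b = query[start:start + window].upper()
        counts.append(
            sum(1 for x, y in zip(a, b)
                if x != y and x in "ACGT" and y in "ACGT")
        )
    return counts


def expected_snps(rate_per_kb: float, length_kb: float) -> float:
    """SNPs expected in a region if it diverged at the genome-wide rate."""
    return rate_per_kb * length_kb


# At 11 SNPs per kb, a 17-kb island would be expected to carry
# 11 * 17 = 187 SNPs, against the single SNP observed.
island_expectation = expected_snps(11, 17)
```

Comparing the per-window counts with the genome-wide average is what makes both the near-identical island and the SNP-dense boundary stand out.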

The other kind of fGI, which could be called "additive," contains
variable numbers of gene cassettes in the different strains,
giving rise to very variable sizes, although part of the fGI remains
conserved. Most of the variability found within each CF was
derived from changes that affected additive fGIs (table 1).
Typically, these changes involve sequential insertions at a
specific site such as a tRNA gene, as in the case of fGI1, the
metal-resistance/hydrogenase island that has already been described
(Gonzaga et al. 2012), or specific mobile elements such as an
integron (fGI2, see later) or a mobilizable GI (MGI), fGI9. As an
example of this kind of island (fig. 4), fGI9 is a typical MGI;
MGIs are mobile elements that utilize the conjugation
machinery of ICEs or plasmids for their transfer (Daccord et al.
2013). Interestingly, the size of this fGI in strains having ICEs
(CF1 and CF5) was over twice as large as in the others.
Although this fGI was conserved within CF2, in CF1 a complete
set of genes related to the Entner-Doudoroff pathway,
flanked by transposases, was identified in AltDE1 and UM7
but not in UM4b. Finally, some of the additive fGIs (5, 7, and
8), although always rich in transposable elements, could not
be characterized at the level of the mechanisms providing
variability.

Presence of the CFs at the Different Sites 

The most similar genomes by SNPs, aside from the identical
ones mentioned before, were U4 and U7, both in CF2.
Interestingly, a similar degree of identity was found between
AltDE1 (from the South Adriatic) and UM7 (from Urania), in
spite of the distance in time and space between the isolation of
these two strains. Actually, when considering the fGIs, the pair from
the South Adriatic and the Ionian, AltDE1 and UM7, was the
most similar (table 1). The finding of two nearly identical
clones isolated from different locations and at different
times has not previously been reported outside of clinical settings
(Reeves et al. 2011). The major difference between AltDE1
and UM7 was found in fGI2 (Gonzaga et al. 2012). This
variable region was the only fGI that was different in all the strains
sequenced (table 1), and its high variability was due to the
presence of an integron. Integrons are mobile DNA elements
whose core structure consists of a gene that codes for an
integrase, IntI, and a proximal primary recombination
sequence called the attI site; together they allow bacteria to
capture and express gene cassettes (Mazel 2006; Boucher et al. 2007).
Multiple alignments of fGI2 showed that the integrase was
identical in all the strains sequenced and that a single
identifiable attI site was present, typical of class 1 integrons (Partridge
et al. 2000) (data not shown). Inclusion of different cassettes
makes the length variable among the strains, ranging from 5 kb
in MED64 (which contains only one cassette) to 27 kb in AltDE.
The function of most of these cassettes is unknown. The two
cassettes of AltDE1 and the single cassette of UM7 are completely
different (supplementary table S2, Supplementary Material
online), illustrating the highly dynamic nature of this genomic
region. Aside from the SNPs, the only other difference
between these two strains was found in fGI8 (containing
glycosyltransferases). In CF1, four different thrombospondin type 3
repeat family protein genes have been found in this fGI.
Thrombospondins are multimeric multidomain glycoproteins
that function at the cell surface (Lawler and Hynes 1986).
UM7 and UM4b have lost an internal repeat (459 nucleotides)
in the first thrombospondin-like protein compared with
AltDE1 (data not shown). This could be attributed to
intragenomic recombination or duplication-related errors due to the
repetitive nature of this region. In summary, the two
differences in gene content found between the closest pair of
isolates are 1) the acquisition of different cassettes by an integron
and 2) a deletion of approximately 450 nucleotides, likely due
to an intragenomic recombination event.

Another way to gauge the presence of a CF at different
sites is by comparing the genomes with metagenomes where
the microbes are abundant (Gonzaga et al. 2012).
Unfortunately, the abundance of A. macleodii in most
metagenomic data sets such as the Global Ocean Survey (Rusch et al.
2007) is too low to allow for a precise estimation of the
presence of the different CFs detected here at different
locations. There are, however, two data sets that are rich enough
in biomass of this microbe to assess the presence of the
different CFs. One is a large metagenomic fosmid library built
with biomass retrieved from the same sample from which
AltDE and AltDE1 were isolated (Gonzaga et al. 2012). From
this fosmid library (38,704), 245 fosmids could be assigned to
environmental A. macleodii by similarity to the genomes
available, and 161 were fully sequenced. The availability of the
new genomes allows the fosmids to be assigned to the five CFs
described here (table 1). Although most of the fosmids are
more similar to the two strains isolated from the same
location, there are significant numbers of them that seem more
related to CF2 or to the CFs represented by 615 and MED64.
This could be taken as an indication of the presence of all the
CFs at this location, although in very different proportions. The
other is a direct pyrosequencing metagenome obtained from a sample
from the deepest Mediterranean basin, Matapan-Vavilov
(Smedile et al. 2012), which contained large numbers of A. macleodii
reads; recruitment analysis indicated that most of the
biomass here corresponded to clones highly related to the CF
represented by AltDE because even the fGIs of this strain
recruited at high similarity (supplementary fig. S8,
Supplementary Material online).
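
The idea of assigning each fosmid to its most similar CF can be sketched as follows. This is only an illustration under stated assumptions: the text says that fosmids were assigned "by similarity to the genomes available" but gives no criterion, so the percent-identity input, the 95% cutoff, and the function name are all hypothetical.

```python
from collections import Counter
from typing import Optional


def assign_fosmids(
    identities: dict[str, dict[str, float]],
    min_identity: float = 95.0,  # hypothetical cutoff, not from the paper
) -> dict[str, Optional[str]]:
    """Assign each fosmid to the CF whose reference genome it matches best.

    `identities` maps fosmid name -> {CF name -> percent identity}.
    A fosmid whose best hit falls below `min_identity` is left unassigned.
    Exact ties are resolved in favor of the CF listed first.
    """
    assignment: dict[str, Optional[str]] = {}
    for fosmid, hits in identities.items():
        best_cf = max(hits, key=hits.get) if hits else None
        if best_cf is not None and hits[best_cf] >= min_identity:
            assignment[fosmid] = best_cf
        else:
            assignment[fosmid] = None
    return assignment


# Toy input with made-up identities, for illustration only.
toy = {
    "fosmid_a": {"CF1": 99.8, "CF2": 97.0},
    "fosmid_b": {"CF1": 96.5, "CF2": 99.1},
    "fosmid_c": {"CF1": 80.0, "CF2": 82.0},
}
per_cf = Counter(cf for cf in assign_fosmids(toy).values() if cf is not None)
```

Tallying the assignments per CF, as in `per_cf`, gives the kind of relative proportions that the text reads as evidence for several CFs coexisting at one site.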

Discussion 

The genomes described here have been sequenced at high
coverage, and the assembly has been confirmed by PCR,
providing data of very high quality and reliability. Furthermore,
for isolates UM7 and U8, the independent sequencing of
identical strains (UM8 and U12, respectively) provided an even
more robust confirmation of the genome data, which is very important
when comparing highly similar genomes. This allows a reliable
description of the gradient of similarity, revealing the way
these microbes change in the short to medium time scale in
a marine planktonic habitat. It is difficult to get an accurate
time frame for the divergence of the most recent common
ancestor (MRCA) of the isolates studied here, but recent work
using genomes of isolates from clonal expansions of
pathogenic bacteria provides some relevant numbers. For example,
uropathogenic E. coli isolates obtained within a single
household showed 1.1 SNPs per genome and year (Reeves et al.
2011). However, probably the most relevant estimation for
our case is that of Mutreja et al. (2011), who analyzed the
genomes of 113 monophyletic 7th pandemic V. cholerae
isolates. These authors calculated a rate of 3.3 SNPs per genome
and year. Although these are pathogenic clones isolated from
human patients, V. cholerae survives in natural waters, is a close
phylogenetic relative of A. macleodii, and probably has a
similar lifestyle. If we accept this value for the rate of change of
the core genome of A. macleodii in the marine environment
(an admittedly risky extrapolation), then the CFs all diverged
within a similar timeframe, between 7,000 and 10,000 years ago.
This figure is quite meaningful because the Mediterranean
Sea and even the global ocean have not changed much since
the end of the last glacial period about 10,000 years ago
(Annan and Hargreaves 2013). Since then, at least in terms of
temperature, conditions have remained quite constant. Therefore,
this would have been a reasonable starting point for the CFs of
the A. macleodii deep clade to radiate.

Strains belonging to the same CF differed by between 48 and
116 SNPs over the core genome. These are remarkably small
figures for free-living independent isolates. A recent report
found that two Listeria genomes obtained from two food packaging
facilities 6 years apart differed by 18 SNPs (Holch et al.
2013). In a widespread study of methicillin-resistant
Staphylococcus aureus, the most similar strains differed
at 14 SNPs but were obtained only 11 weeks apart (Harris
et al. 2010). The L2 El Tor V. cholerae isolates obtained over
53 years mentioned earlier (Mutreja et al. 2011) had between
50 and 250 SNPs. Interestingly, the two most similar genomes
found here were isolated from different locations. Both
samples came from the Eastern Mediterranean, and the deep
Adriatic and Ionian are connected by currents.
However, the main connection flows in the opposite direction
to the collection dates (Menna and Poulain 2010), that is, from
the deep Adriatic, sampled in 2003, to the deep Ionian, sampled in
1998. Furthermore, the samples were also taken in the
opposite direction to the sinking flow (first at 3,500 m and later
at 1,000 m). All these oceanographic parameters indicate that
the water masses of origin of both isolates do not mix
rapidly, and the MRCA must be at least decades old. The
isolation dates alone, 5 years apart, give a minimal time frame
for the endurance of this clonal lineage. Using the rate of 3.3 SNPs
per genome and year described earlier, the MRCA of AltDE1 and UM7 (87
SNPs) would be 26 years older than the strains. Actually,
even the most hypervariable regions are quite well conserved
between these two genomes, including all the fGIs and even
the lambda-like lysogenic phage, providing one of the longest
examples of persistence in nature of a phage in a lysogenic
state. The main differences found between these two strains
were due to the insertion of different gene cassettes in an
integron, which indicates that this could be the most rapidly
changing type of mobile genetic element. Incidentally, the
presence of totally different genes at this site also proves that
they do not belong to a single laboratory clone, that is, that
they are real and independent natural isolates and not the
result of laboratory contamination. On the other hand, the
other genome within CF1, UM4b, whose MRCA is likely a few
decades older, had no phage inserted and many more
differences affecting some of the additive fGIs. Actually, five out of
the seven fGIs of this type identified had already changed
(table 1).
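
The time estimates above follow from dividing a core-genome SNP count by the assumed rate of 3.3 SNPs per genome and year. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the calculation mirrors the text's simple division rather than a formal molecular-clock model (it does not, for instance, treat the two diverging lineages separately).

```python
V_CHOLERAE_RATE = 3.3  # SNPs per genome and year (Mutreja et al. 2011)


def years_since_mrca(core_snps: float, rate: float = V_CHOLERAE_RATE) -> float:
    """Rough age of the MRCA: SNP count divided by SNPs per genome and year."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return core_snps / rate


# AltDE1 versus UM7: 87 core-genome SNPs.
altde1_um7_years = years_since_mrca(87)

# Range reported within CFs: 48 to 116 SNPs.
within_cf_years = (years_since_mrca(48), years_since_mrca(116))

# Between CFs: approximately 30,000 core-genome SNPs.
between_cf_years = years_since_mrca(30_000)
```

With these inputs the AltDE1 and UM7 pair comes out at about 26 years, and the roughly 30,000 SNPs separating CFs give about 9,000 years, inside the 7,000 to 10,000 year window quoted in the text.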



Recently, it has been proposed for V. cyclitrophicus strains
that the main driver of divergence is the
higher rate of recombination taking place among strains that
share a similar habitat (Shapiro et al. 2012). Along these lines,
we have found a higher impact of recombination within the
CFs than between them. Furthermore, for isolates from the
same location, such as those of CF2, the impact of
recombination was even higher. It seems clear that these groups of
gammaproteobacteria are very recombinogenic, and the
abundance of conjugative elements such as plasmids, ICEs,
MGIs, or lysogenic phages provides plenty of mechanisms to
exchange genome fragments between cells. However, in
Alteromonas, there are also many examples of significant
genomic exchange between the surface and deep clades and
even with other species such as Alteromonas sp. SN2.

Notably, the degree of nucleotide divergence seems to
have little effect on the recombination events that precede the
replacement of fGIs. In particular, the presence of identical
versions of the flagellum glycosylation fGI3 and the giant protein
GI in very different genomic backgrounds indicates that in
these cases recombination can take place between distant
relatives. A similar complete replacement of a very large
protein by homologous recombination has been described in
V. cholerae, where a region with a high density of SNPs was also
followed by this gene cluster (Mutreja et al. 2011). In our case, the
identity of the fGIs was so high that the exchange must
have happened very recently, which indicates that replacement
fGIs have a fast turnover. Also, the low values of the dN/dS ratio at
the high-variability site upstream of these fGIs indicate that
they are frequently subjected to recombination. In these cases,
rare recombination events may be favored by the strong
selective pressures to evade phage predation. The change of any of
these exposed structures would make the clones resistant to
some of the phages preying upon them. It is remarkable that a
very similar pattern of replacement of fGIs has been found in
113 genomes of V. cholerae (Mutreja et al. 2011), in which,
within a background of very little core variation, there was also a
replacement of the O-chain polysaccharide gene cluster and
the giant protein gene described before.
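
The argument from synonymous SNP enrichment rests on classifying each coding difference as synonymous or nonsynonymous. The sketch below does this for two aligned, in-frame coding sequences using the standard genetic code. It is a simplified codon-level count, not a full dN/dS estimate (which would also normalize by the numbers of synonymous and nonsynonymous sites), and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_codon_differences(cds_a: str, cds_b: str) -> tuple[int, int]:
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Assumptions: both sequences are aligned, in frame, and of equal length;
    codons containing gaps or ambiguous bases are skipped.
    """
    if len(cds_a) != len(cds_b) or len(cds_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame coding regions")
    syn = nonsyn = 0
    for i in range(0, len(cds_a), 3):
        codon_a = cds_a[i:i + 3].upper()
        codon_b = cds_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn
```

A region where almost all differences fall in the synonymous class, as reported upstream of the replaced islands, is what one would expect from imported divergent sequence under purifying selection rather than from positive selection on the protein.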

The availability of complete and fully assembled genomes
of closely related strains opens a window into the dynamics of
variation of prokaryotic genomes. Studies of free-living
microbes such as Alteromonas are particularly relevant because
they provide information that applies to ecologically relevant
microbes and also help in understanding the more complex
cases of pathogenic bacteria that have reservoirs in natural
habitats. The different rates and mechanisms of variation of
the core and flexible genomes illustrate how prokaryotic cells
balance the needs for change and conservation. The presence
of multiple concurrent CFs of A. macleodii has already been
explored (Gonzaga et al. 2012). Here, we have proven the
presence of some of them at different locations, expanding
the model (Rodriguez-Valera et al. 2009; Rodriguez-Valera
and Ussery 2012) and refining the description of the patterns
of variation among the different CFs.

Supplementary Material 

Supplementary figures S1-S8 and tables S1 and S2 are
available at Genome Biology and Evolution online
(http://www.gbe.oxfordjournals.org/).